WHAT IS CLAIMED IS:

1. An apparatus for executing at least one single 
program multiple data (SPMD) program in a microprocessor, 
said apparatus comprising: 

a micro single instruction multiple data (SIMD) unit 
located within said microprocessor; and 

a job buffer having an output coupled to an input of 
said micro SIMD unit; 

wherein said job buffer dynamically allocates tasks to 
said micro SIMD unit. 

2. The apparatus as set forth in Claim 1 wherein 
said micro SIMD unit is capable of sending job status 
information to said job buffer. 

3. The apparatus as set forth in Claim 1 wherein 
said at least one SPMD program comprises a plurality of 
input data streams having moderate diversification of 
control flows. 

4. The apparatus as set forth in Claim 3 wherein
said apparatus executes said at least one SPMD program once
for each input data stream of said plurality of input data
streams.

5. The apparatus as set forth in Claim 4 wherein 
said apparatus generates an instruction stream for each 
input data stream of said plurality of input data streams. 

6. The apparatus as set forth in Claim 3 wherein 
said apparatus executes a plurality of SPMD programs and 
wherein each SPMD program of said plurality of SPMD 
programs is executed on a number of input data streams. 

7. The apparatus as set forth in Claim 6 wherein 
said number of input data streams is greater than a program 
granularity threshold. 

8. The apparatus as set forth in Claim 1 wherein 
said job buffer dynamically allocates tasks to said micro 
SIMD unit by dynamically bundling jobs to be executed based 
on a control flow equivalence of said jobs. 

9. The apparatus as set forth in Claim 8 wherein
said apparatus performs job clustering to form a job bundle 
in which each job in said job bundle has an equivalent 
control flow. 

10. The apparatus as set forth in Claim 9 wherein 
said apparatus performs said job clustering based on a job 
processing status of said jobs in said job bundle. 

11. The apparatus as set forth in Claim 8 wherein 
said apparatus forces a task to terminate at a point where 
a job control path might fork by placing a code-stop in
said task. 

12. The apparatus as set forth in Claim 11 wherein 
said apparatus minimizes a required number of code-stops to 
be placed in said task by excluding from code-stop 
placement each control flow statement that is equivalent
to a select instruction. 

13. The apparatus as set forth in Claim 9 wherein
said apparatus maximizes a size of a job cluster by 
selecting tasks for execution in which a job processing 
status of each of said tasks is complete. 

14. The apparatus as set forth in Claim 8 wherein
said apparatus executes a data loading phase for a task 
before said apparatus executes a task execution phase for 
said task. 

15. A method for executing at least one single
program multiple data (SPMD) program in a microprocessor, 
said method comprising the steps of: 

providing a micro single instruction multiple data 
(SIMD) unit located within said microprocessor; 

providing a job buffer having an output coupled to an 
input of said micro SIMD unit; and 

dynamically allocating tasks to said micro SIMD unit 
in said job buffer. 

16. The method as set forth in Claim 15 further 
comprising the step of: 

sending job status information from said micro SIMD unit to
said job buffer. 

17. The method as set forth in Claim 15 wherein said 
at least one SPMD program comprises a plurality of input 
data streams having moderate diversification of control 
flows.

18. The method as set forth in Claim 17 further 
comprising the step of: 

executing said at least one SPMD program once for each 
input data stream of said plurality of input data streams. 

19. The method as set forth in Claim 18 further
comprising the step of: 

generating an instruction stream for each input data 
stream of said plurality of input data streams. 

20. The method as set forth in Claim 17 further 
comprising the steps of: 

executing a plurality of SPMD programs; and 
executing each SPMD program of said plurality of SPMD 
programs on a number of input data streams. 

21. The method as set forth in Claim 20 wherein said 
number of input data streams is greater than a program 
granularity threshold. 

22. The method as set forth in Claim 15 wherein said 
job buffer dynamically allocates tasks to said micro SIMD 
unit by dynamically bundling jobs to be executed based on a 
control flow equivalence of said jobs. 

23. The method as set forth in Claim 22 further
comprising the step of: 

performing job clustering to form a job bundle in 
which each job in said job bundle has an equivalent control 
flow. 

24. The method as set forth in Claim 23 further 
comprising the step of: 

performing said job clustering based on a job 
processing status of said jobs in said job bundle. 

25. The method as set forth in Claim 22 further 
comprising the step of: 

forcing a task to terminate at a point where a job 
control path might fork by placing a code-stop in said 
task. 

26. The method as set forth in Claim 25 further 
comprising the step of: 

minimizing a required number of code-stops to be
placed in said task by excluding from code-stop placement
each control flow statement that is equivalent to a select
instruction.

27. The method as set forth in Claim 23 further
comprising the step of:

maximizing a size of a job cluster by selecting tasks
for execution in which a job processing status of each of
said tasks is complete.

28. The method as set forth in Claim 22 further
comprising the step of:

executing a data loading phase for a task before
executing a task execution phase for said task.